Stacks, Queues, Lists and Hash Tables

Data Structures And Number Systems
© Copyright Brian Brown, 1984-1999. All rights reserved.

Part 5

STACKS
A stack is used to provide temporary storage space for values. It is defined as a data structure which operates on a first in, last out basis. Its uses a single pointer (index) to keep track of the information in the stack.

The basic operations associated with a stack are,

insert (push) an item onto the stack
remove (pop) an item from the stack

The following diagram shows an empty stack of four locations. It looks just like an array, and it has an index pointer pointing to the beginning of the stack (called the TOP of the stack).


     +--------+
     |        |   <------- Stack Pointer       
     +--------+ 
     |        |         
     +--------+ 
     |        |         
     +--------+
     |        |          
     +--------+

Pushing items onto the stack
The stack pointer is considered to be pointing to a free (empty) slot in the stack. A push operation thus involves copying data into the empty slot of the stack, then adjusting the stack pointer to point to the next free slot.


   Module Push
        stack[stack_pointer] = data;
        stack_pointer = stack_pointer + 1;
   Module End

Lets push the value 45 onto the stack. [Note: The stack could hold any data type, not necessarily decimal numbers. We have used numbers for simplicity]. The stack now looks like


     +--------+ 
     |   45   |         
     +--------+
     |        |   <------- Stack Pointer       
     +--------+ 
     |        |         
     +--------+ 
     |        |         
     +--------+

Note how the stack pointer has been adjusted to point to the next free location in the stack. [Note: for the time being we are ignoring certain problems. We will address these shortly!!!].

Removing items from the stack
To remove an item, first decrement (subtract 1 from) the stack pointer, then extract the data from that position in the stack.


   Module Remove
        stack_pointer = stack_pointer - 1;
        data = stack[stack_pointer];
   Module End

Time now to address the problems of the above implementation
There are a number of problems associated with the above routines for pushing and removing items.

the push module does not check to see if the stack is full
the remove module does not check to see if the stack is empty

There are a number of solutions to this problem. We shall present a simplified solution. We do not argue that it is the best, just that it is one of a possible number of solutions.


   Comment: Assume that the array elements begin at 1
   Comment: Assume that the maximum elements of the stack is MAX

   Var: stack[MAX] : Integer;

   Module Initialize
        stack_pointer = 1;
   Module End

   Module Push
        if stack_pointer >= MAX then
           return error
        else begin
           stack[stack_pointer] = data;
           stack_pointer = stack_pointer + 1;
        end
   Module End

   Module Remove
        if stack_pointer <= 1 then
           return error
        else begin
           stack_pointer = stack_pointer - 1;
           data = stack[stack_pointer];
        end
   Module End

QUEUES
Everybody has experience with queues as they are common place. Queues occur in cafeterias, petrol stations, shopping centres, anyplace where many people (customers) line up for few resources (cashier's, salespeople, petrol pumps etc).

The purpose of a queue is to provide some form of buffering. Typical uses of queues in computer systems are,

process management
buffer between the fast computer and a slow printer

A queue is defined as a data structure which holds a series of items to be processed on a first in first out basis (though some queues can be sorted in priority). The operations associated with a queue are,

initialize the queue
insert an item in the queue
remove an item from the queue
count the number of empty slots in the queue
count the number of items in the queue

The following diagram shows an empty queue. It is identified as a series of ten locations, and two pointers named front and rear. These two pointers keep track of where the front and rear of the queue is.


	  1   2   3   4   5   6   7   8   9  10 
	+---+---+---+---+---+---+---+---+---+---+ 
	|   |   |   |   |   |   |   |   |   |   | 
	+---+---+---+---+---+---+---+---+---+---+ 
	+---+ 
	| 1 | Front 
	+---+ 
	+---+ 
	| 10| Rear 
	+---+

The front pointer is used to delete items, and the rear pointer insert items. Inserting two items called A and B will rearrange the queue to look like,

 
	  1   2   3   4   5   6   7   8   9  10 
	+---+---+---+---+---+---+---+---+---+---+ 
	| A | B |   |   |   |   |   |   |   |   | 
	+---+---+---+---+---+---+---+---+---+---+ 
	+---+ 
	| 1 | Front 
	+---+ 
	+---+ 
	| 2 | Rear 
	+---+

The pseudo-code for the various queue operations follows.

 
	module initialize 
		count = 0 
		head = 1 
		rear = size of queue 
	end module initialize 
 
	module insert 
		if count = size of queue 
			queue is full and do not insert 
		else 
		begin 
			increment count 
			increment rear 
			if rear > size of queue 
				rear = 1 
			endif 
			insert data at queue[rear] 
		endif 
	end module insert 
 
	module remove 
		if count = 0 
			queue is empty and do not remove 
		else 
		begin 
			data = queue[front] 
			decrement count 
			increment front 
			if front > size of queue 
				front = 1 
			endif 
		endif 
	end module remove 

	module count 
		return count 
	end module count 
 
	module free_space 
		return queue.size - count 
	end module free_space

The implementation of this is left to the student as a programming exercise.

LINKED LISTS
Data structures can be arranged in memory in a variety of different ways. Each method has advantages and disadvantages. Arrays for example seem easy to use, where each element is stored sequentially in memory.

This type of approach is not efficient in larger computer systems, where a number of users share main memory. In such circumstances, there may not be enough contiguous main memory left to hold a sequentially stored data structure like an array (but there could be enough if all the small free blocks were moved into one large block).

One way to overcome this is to link all elements of data together. Data is arranged into records, and a special item is added to each record called a pointer (a link field), which contains a link to the next record etc. Renaming each record as a node and we have a complex data structure called a linked list.

A linked list is a series of nodes connected together via pointers. Pictorially, it looks like,

 
	 Node 1          +---+ Node 2        +---+ Node n 
	+--------+--+    |  +--------+--+    |  +--------+--+ 
	|        | ------+  |        | ------+  |        |  | 
	+--------+--+       +--------+--+       +--------+--+ 
	   Data   Link        Data    Lnk         Data    Lnk

In Pascal, a node definition looks like,

 
	type	ptr = ^node; 
		node =	record 
			data : string[20]; 
			next : ptr; 
		end;

The following Pascal program declares three nodes, inserts data into each of them, then using a pointer, cycles through the chain from node to node printing out the contents of each node. The value 0 is always stored in the last node of a chain, to indicate the end of the list.

 
	program linked_list; 
	type	ptr = ^node; 
		node = record 
			data : string[20]; 
			next : ptr; 
			end; 
	var 
		node1, node2, node3 : node; 
		nodeptr : ptr; 
	begin 
		node1.data := 'Welcome '; 
		node1.next := @node2; 
		node2.data := 'to '; 
		node2.next := @node3; 
		node3.data := 'Linked Lists.'; 
		node3.next := nil; 
		nodeptr := @node1; 
		while nodeptr <> nil do 
		begin 
			write( nodeptr^.data ); 
			nodeptr := nodeptr^.next; 
		end; 
		writeln; 
	end.

Linked lists are used in a wide variety of applications.

directory structures
operating systems for process management
data management in data base systems.

An advantage of storing data using a linked list is that the key sequence can be altered by changing the pointers, not by moving data nodes. This means sorting data is quick, but searching is slower as you must follow the chain from node to node.

HASHING
Hashing is a technique used to improve data access times. Rather than sorting the data according to a key, it computes the location of the data from the key. In simple terms, it computes an index value from the key of the desired record. At that index value, the record is stored/retrieved.

If we had an array of 1000 elements, we would expect our hash function (the method used to calculate the index value) to evenly distribute the keys over this range. In practice this is difficult to achieve. It is possible that two different search keys might generate the same hash value (ie, index), and we call this a collision.

The size of the table (array) is generally a prime number, as this has the least probability of creating collisions.

The following code segment defines an array of records in Pascal which will be used to demonstrate hashing.

 
	const	size = 511; 
	type	data = record 
			name : string[20]; 
			age : integer; 
		end; 
		hashtable = array[1..size] of data;

Next, lets initialize the table with zero's, so this makes it easy to see if information is already stored at a particular element (important when inserting data later).

 
	procedure initialize ( var Table : hashtable ); 
	var i : integer; 
	begin 
		for i := 1 to size do 
		begin 
			Table[i].name := '                    '; 
			Table[i].age := 0; 
		end; 
	end;

The procedure insert will take the hash value and the data and insert it into the table. If the position is already occupied (ie, a collision occurs) the table will be searched from that point onwards till an empty slot occurs.

 
	procedure insert ( Position : integer; Element : data ; var Table : hashtable ); 
	begin 
		while Table[Position].name[1] <> ' ' do 
			Position := (Position + 1) mod size; 
		Table[Position] := Element; 
	end;

The procedure hash will generate a hash value from a given data record. In deciding this, it must generate a value between 1 and 511. Obviously, we cannot use the age field of the record, this will generate too many collisions. The best method is to add up the value of all the characters in the persons name, then extract nine bits from it (2^9 = 512) to generate the index value.


	procedure hash ( var Position : integer ; Element : data ; var Table : hashtable ); 
	var i : integer; 
	begin 
		i := 1; 
		position := 0; 
		while i < 20 do 
			position := position + ord(Element.name[i]); 
		position := position mod 511; 
	end;

The remaining procedures to search and delete are left as a programming exercise for the student.

INDEXED SEQUENTIAL
This method seeks to make a direct access file appear like a sequence file. The sequencing is achieved via an index table, which holds the keys of the records and their position within the file.

A program reads the index sequentially, and when it locates the key it desires, it reads the record from the indicated position.

The advantages of this method are,

the records need not be in key sequence
the records may be processed sequentially or directly

FILE MERGING
This is the process of merging two or more input files into a single output file (which are normally all sequential in nature). The files to be merged are first sorted using the same key sequence, which is preserved in the output file.

A pseudo-code example for a 2-way merge is shown below.

 
	program two_way merge 
	begin 
		read 1st record of infile1, name as rec1 
		read 1st record of infile2, name as rec2 
		while rec1 or rec2 is not an end-of-record 
		begin 
			if rec1.key < rec2.key 
			begin 
				write rec1 to outfile 
				read new rec1 from  infile1 
			end 
			else 
			begin 
				write rec2 to outfile 
				read new rec2 from infile2 
			end 
			endif 
		endwhile 
		write end-of-record to outfile 
	end program two_way merge

Home | Other Courses | Feedback | Notes | Tests